 enterprise-wide data ai initiative


10 Machine Learning Model Training Mistakes - AI Summary

#artificialintelligence

By Sandeep Uttamchandani, Ph.D., a product/software builder (VP of Engineering) and a leader of enterprise-wide data/AI initiatives (CDO). In this article, I share the ten deadly sins of ML model training: these are the most common mistakes as well as the easiest to overlook. During model training, there are scenarios in which the loss-epoch graph keeps bouncing around and does not seem to converge, regardless of the number of epochs. There is no silver bullet, as there are multiple root causes to investigate: bad training examples, missing truths, changing data distributions, or too high a learning rate. The most common one I have seen is bad training examples, caused by a combination of anomalous data and incorrect labels. The more the same data is used for both parameter and hyperparameter settings, the less confidence there is that the results will actually generalize.
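The point about reusing the same data for parameter and hyperparameter settings can be illustrated with a three-way split. The sketch below is not from the article: the function name `three_way_split` and the 60/20/20 default fractions are assumptions chosen for illustration. The idea is that model parameters are fit on the training set, hyperparameters are chosen on the validation set, and the test set is touched only once for the final estimate.

```python
import random


def three_way_split(examples, val_frac=0.2, test_frac=0.2, seed=0):
    """Shuffle examples and split them into train, validation, and test lists.

    Hypothetical helper: the name and default fractions are illustrative only.
    """
    if not 0 < val_frac + test_frac < 1:
        raise ValueError("val_frac + test_frac must be between 0 and 1")

    shuffled = list(examples)
    # A seeded generator keeps the split reproducible across runs.
    random.Random(seed).shuffle(shuffled)

    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)

    test = shuffled[:n_test]                  # final, one-time evaluation
    val = shuffled[n_test:n_test + n_val]     # hyperparameter selection
    train = shuffled[n_test + n_val:]         # parameter fitting
    return train, val, test


train, val, test = three_way_split(range(100))
print(len(train), len(val), len(test))
```

A random shuffle like this assumes the examples are independent; for time-ordered or grouped data, a split that respects time or group boundaries would likely be more appropriate.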


9 Deadly Sins of Machine Learning Dataset Selection - KDnuggets

#artificialintelligence

Let's start with an obvious fact: ML models can only be as good as the datasets used to build them! While there is a lot of emphasis on ML model building and algorithm selection, teams often do not pay enough attention to dataset selection. In my experience, investing time up front in dataset selection saves endless hours later during model debugging and production rollout. Depending on the ML model being built, outliers can be either noise to ignore or important signals to take into account. Outliers arising from collection errors are the ones that should be ignored.
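As a rough illustration of the outlier point, the sketch below flags candidate outliers with the common 1.5 × IQR rule so they can be reviewed before deciding whether they are collection errors or meaningful signal. The article does not prescribe a detection method; the IQR rule, the function name `flag_outliers`, and the sample readings are all assumptions made for this example.

```python
import statistics


def flag_outliers(values, k=1.5):
    """Return values lying more than k * IQR outside the middle 50% of the data.

    Hypothetical helper: flags values for review rather than deleting them,
    since some outliers carry real signal.
    """
    # quantiles(n=4) returns the three quartile cut points; q2 is unused here.
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sensor readings with one suspicious value.
readings = [10, 11, 12, 11, 10, 12, 11, 95]
print(flag_outliers(readings))
```

Whether a flagged value is dropped or kept still depends on the model being built: a reading caused by a faulty sensor is noise, while the same magnitude in a fraud-detection dataset may be exactly what the model needs to learn.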